What is claimed is: 

1. A method for transposing data in a plurality of processing elements, comprising:
shifting the data along diagonals of the plurality of processing elements until the
processing elements in the diagonal have received the data held by every other processing
element in that diagonal;

selecting data as final output data based on a processing element's position. 
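
The shift-then-select idea in claim 1 can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the claimed implementation: the array is square with wrap-around (toroidal) connections, x is the column index, y is the row index counted upward from the bottom row, and each diagonal shift moves every value one step in +x and +y. The function names and the `history` list are hypothetical, and the selection index borrows the first expression from claim 3.

```python
def shift_diagonally(grid):
    """One wrap-around diagonal shift: every value moves one column to the
    right and one row up (+x, +y under the assumed axes)."""
    n = len(grid)
    # PE (r, c) receives the value previously held by PE (r + 1, c - 1).
    return [[grid[(r + 1) % n][(c - 1) % n] for c in range(n)] for r in range(n)]


def transpose_by_diagonal_shifts(grid):
    """Shift n - 1 times, record what each PE receives, then select by position."""
    n = len(grid)
    # history[r][c][t] is the value PE (r, c) holds after t shifts.
    history = [[[grid[r][c]] for c in range(n)] for r in range(n)]
    current = grid
    for _ in range(n - 1):
        current = shift_diagonally(current)
        for r in range(n):
            for c in range(n):
                history[r][c].append(current[r][c])
    result = []
    for r in range(n):
        row = []
        for c in range(n):
            x, y = c, n - 1 - r          # assumed axes: y counts up from the bottom row
            row.append(history[r][c][(x + y + 1) % n])  # position-based selection
        result.append(row)
    return result


if __name__ == "__main__":
    for row in transpose_by_diagonal_shifts([[1, 2, 3], [4, 5, 6], [7, 8, 9]]):
        print(row)
```

After n - 1 shifts every processing element has seen all n values on its wrapped diagonal, so the only remaining work is choosing which one to keep. Under a different axis or shift-direction convention the selection index would change accordingly.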

2. The method of claim 1 additionally comprising one of loading an initial count into each 
processing element and calculating an initial count locally based on the processing element's 
location, said selecting being responsive to said initial count. 

3. The method of claim 2 wherein said plurality of processing elements is arranged in an 
array and said initial count is given by one of the following expressions: 

(x + y + 1) MOD (array size)
(C + R + 1) MOD (array size)
(C + y + 1) MOD (array size) or
(x + R + 1) MOD (array size).
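
As a worked example of the first expression, the sketch below tabulates (x + y + 1) MOD (array size) for a small array. It assumes, purely for illustration, that x is the column index, that y is the row index counted upward from the bottom row, and that "array size" means the side length n of a square array; `initial_counts` is a hypothetical helper name.

```python
def initial_counts(n):
    """Return the initial count of every PE, rows listed top to bottom.

    Assumption: x is the column index and y counts rows upward from the
    bottom row, so the top row has y = n - 1.
    """
    counts = []
    for r in range(n):            # r = 0 is the top row
        y = n - 1 - r
        counts.append([(x + y + 1) % n for x in range(n)])
    return counts


if __name__ == "__main__":
    for row in initial_counts(4):
        print(row)
```

Under these assumptions the elements on the main diagonal get a count of zero, which fits the fact that a transpose leaves them in place, and every other count lies between 1 and n - 1.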

4. The method of claim 2 additionally comprising maintaining a current count in each 
processing element, said current count being responsive to said initial count and the number of 
data shifts performed, said selecting being responsive to said current count. 

5. The method of claim 4 wherein said maintaining a current count includes altering said 
initial count at programmable intervals by a programmable amount. 

6. The method of claim 4 wherein said initial count is decremented in response to said 
shifting of data to produce said current count. 

7. The method of claim 4 wherein said selecting occurs when said current count is
non-positive.
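
The count-down selection of claims 4, 6 and 7 can be sketched as a per-element state machine. This is an illustrative model only: the class `ProcessingElement`, its fields, and the driver function are hypothetical, and the axes (x as column, y counted upward from the bottom row), the wrap-around connections, and the +x, +y shift direction are assumptions rather than details taken from the claims.

```python
class ProcessingElement:
    """Holds a value, a current count, and a latched output."""

    def __init__(self, value, initial_count):
        self.register = value        # data currently held and passed on
        self.count = initial_count   # current count starts at the initial count
        self.selected = False
        self.output = None
        self._maybe_select()         # a count of zero selects before any shift

    def _maybe_select(self):
        # Latch once, the first time the current count is non-positive.
        if not self.selected and self.count <= 0:
            self.output = self.register
            self.selected = True

    def receive(self, value):
        self.register = value
        self.count -= 1              # decrement in response to the shift
        self._maybe_select()


def transpose_with_countdown(grid):
    n = len(grid)
    pes = [[ProcessingElement(grid[r][c], (c + (n - 1 - r) + 1) % n)
            for c in range(n)] for r in range(n)]
    for _ in range(n - 1):
        # Snapshot first so all elements shift simultaneously.
        outgoing = [[pe.register for pe in row] for row in pes]
        for r in range(n):
            for c in range(n):
                # Assumed direction: data moves one column right and one row up.
                pes[r][c].receive(outgoing[(r + 1) % n][(c - 1) % n])
    return [[pe.output for pe in row] for row in pes]
```

Compared with storing every received value, each element here needs only one data register, one counter, and one output latch, which is presumably the appeal of a count-based selection.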

8. The method of claim 1 additionally comprising maintaining a local count including 
setting a counter to a first known value, and counting up from said first known value based on 
the number of shifts that have been performed, said selecting occurring when a current count 
equals a target count. 

9. The method of claim 1 wherein said shifting includes a combination of vertical and 
horizontal shifting. 

10. The method of claim 1 wherein said shifting includes a combination of shifting in the x
and z directions.

11. A method for transposing data in an array of processing elements, comprising: 
shifting the data along diagonals in the array a number of times equal to N-1 where N
equals the number of processing elements in a diagonal;

outputting data from each processing element as a function of that element's position in a 
diagonal. 

12. The method of claim 11 additionally comprising one of loading an initial count into each
processing element and calculating an initial count locally based on the processing element's 
position in a diagonal, said outputting being responsive to said initial count. 

13. The method of claim 12 wherein said initial count is given by one of the following 
expressions: 

(x + y + 1) MOD (array size)
(C + R + 1) MOD (array size)
(C + y + 1) MOD (array size) or
(x + R + 1) MOD (array size).

14. The method of claim 12 additionally comprising maintaining a current count in each 
processing element, said current count being responsive to said initial count and the number of 
data shifts performed, said outputting being responsive to said current count. 

15. The method of claim 14 wherein said maintaining a current count includes altering said 
initial count at programmable intervals by a programmable amount. 

16. The method of claim 14 wherein said initial count is decremented in response to said 
shifting of data to produce said current count. 

17. The method of claim 16 wherein said outputting occurs when said current count is
non-positive.

18. The method of claim 12 additionally comprising maintaining a local count including 
setting a counter to a first known value, and counting up from said first known value based on 
the number of shifts that have been performed, said outputting occurring when a current count 
equals a target count. 

19. The method of claim 11 wherein said shifting includes a combination of vertical and
horizontal shifting. 

20. The method of claim 11 wherein said shifting includes a combination of shifting in
perpendicular directions.

21. A method for transposing data in a plurality of processing elements, comprising: 
shifting data between processing elements arranged in diagonals; 

setting an initial count in each processing element according to one of the expressions: 

(x + y + 1) MOD (array size)
(C + R + 1) MOD (array size)
(C + y + 1) MOD (array size) or
(x + R + 1) MOD (array size);
modifying the initial count by a programmable amount and at programmable intervals to 
produce a current count; and selecting output data as a function of said current count. 
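
One possible reading of "a programmable amount and at programmable intervals" is sketched below. The claim does not say how the two values are applied, so this is an assumption: the count is reduced by `amount` once every `interval` elementary shifts, and selection happens when the count first becomes non-positive. The function and parameter names are hypothetical.

```python
def shifts_until_selection(initial_count, amount=1, interval=1):
    """Return how many elementary shifts pass before the count is non-positive.

    Assumed behavior: after every `interval` shifts the current count is
    reduced by `amount`.
    """
    if amount <= 0 or interval <= 0:
        raise ValueError("amount and interval must be positive")
    count = initial_count
    shifts = 0
    while count > 0:
        shifts += 1
        if shifts % interval == 0:   # a programmable interval has elapsed
            count -= amount          # modify the count by the programmable amount
    return shifts


if __name__ == "__main__":
    print(shifts_until_selection(3))              # one decrement per shift
    print(shifts_until_selection(3, interval=2))  # one decrement per two shifts
```

With `interval=2`, for instance, the counter would advance once per pair of elementary shifts, which would suit a diagonal move built from one vertical and one horizontal step.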

22. The method of claim 21 wherein said modifying includes counting down from said initial 
count. 

23. The method of claim 22 wherein said selecting occurs when said current count is a
non-positive value.

24. The method of claim 21 wherein said shifting includes a combination of vertical and 
horizontal shifting. 

25. The method of claim 21 wherein said shifting includes a combination of horizontal 
shifting. 

26. A memory device carrying an ordered set of instructions which, when executed, perform 
a method comprising: 

shifting the data along diagonals of the plurality of processing elements until the 
processing elements in the diagonal have received the data held by every other processing 
element in that diagonal; 

selecting data as final output data based on a processing element's position. 